Extended Description of the Survey on Common Strategies of Vocabulary Reuse in Linked Open Data Modeling

Authors

  • Johann Schaible
  • Thomas Gottron
  • Ansgar Scherp
Abstract

Modeling and publishing Linked Open Data (LOD) involves choosing which vocabularies to use. This choice is far from trivial and poses a challenge to a Linked Data engineer. It covers searching for appropriate vocabulary terms, deciding how many vocabularies to consider in the design process, and determining how to select and combine vocabularies. To date, no study has investigated the different strategies of reusing vocabularies for LOD modeling and publishing. In this paper, we present the results of a survey with 79 participants that examines the most preferred vocabulary reuse strategies in LOD modeling. The participants of our survey are LOD publishers and practitioners. Their task was to assess different vocabulary reuse strategies and explain their ranking decisions. We found significant differences between the modeling strategies, which include reusing popular vocabularies, minimizing the number of vocabularies, and staying within one domain vocabulary. One notable insight is that popularity, in the sense of how frequently a vocabulary is used in a data source, is more important than how often individual classes and properties are used in the LOD cloud. Overall, the results of this survey help in understanding the strategies by which data engineers reuse vocabularies, and they may also be used to develop future vocabulary engineering tools.

Similar resources

Survey on Common Strategies of Vocabulary Reuse in Linked Open Data Modeling

The choice of which vocabulary to reuse when modeling and publishing Linked Open Data (LOD) is far from trivial. There is no study that investigates the different strategies of reusing vocabularies for LOD modeling and publishing. In this paper, we present the results of a survey with 79 participants that examines the most preferred vocabulary reuse strategies of LOD modeling. The participants,...

TermPicker: Recommending Vocabulary Terms for Reuse When Modeling Linked Open Data

Linked Open Data (LOD) refers to data published on the Web in a way that it is machine-readable, its meaning is explicitly defined, and it is linked to other data sets. So-called Resource Description Framework (RDF) vocabularies are employed for LOD modeling. An RDF vocabulary is a collection of unique vocabulary terms comprising classes, which describe the type of a data entity, and properties...

A Quantitative Survey on the Use of the Cube

There is a striking increase in the availability of statistical data in the Linked Open Data (LOD) cloud, and the Cube vocabulary has become the de facto standard for the description of multi-dimensional data. However, the reuse of a standard vocabulary needs to pair with modeling strategies that make it easy to locate, consume and integrate information. In this paper, we developed a quantitati...

Towards a Vocabulary for Incorporating Predictive Models into the Linked Data Web

Predictive modeling reflects the process of using data and statistical or data mining methods for predicting new observations. The predictive models that are created out of this process could be reused in different applications in the same sense that open data is reused. Towards this end, a few standards have been proposed in order to enable transfer of predictive models across platforms and ap...

TermPicker: Enabling the Reuse of Vocabulary Terms by Exploiting Data from the Linked Open Data Cloud - An Extended Technical Report

Deciding which vocabulary terms to use when modeling data as Linked Open Data (LOD) is far from trivial. Choosing vocabulary terms that are too general, or terms from vocabularies that are not used by other LOD datasets, is likely to lead to a data representation that is harder for humans to understand and for Linked Data applications to consume. In this technical report, we propose TermPicker:...

Publication date: 2014